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Optical tweezers and AFM cantilevers are often calibrated by fitting their experimental power- 
spectra of Brownian motion. We demonstrate here that if this is done with typical weighted least- 
squares methods the result is a bias of relative size between — 2/n and +l/n on the value of the fitted 
diffusion coefficient. Here n is the number of power-spectra averaged over, so typical calibrations 
contain 10 20% bias. Both the sign and the size of the bias depends on the weighting scheme 
applied. Hence, so do length-scale calibrations based on the diffusion coefficient. 

The fitted value for the characteristic frequency is not affected by this bias. For the AFM then, 
force measurements are not affected provided an independent length-scale calibration is available. 
For optical-tweezers there is no such luck, since the spring constant is found as the ratio of the 
characteristic frequency and the diffusion coefficient. 

We give analytical results for the weight-dependent bias for the wide class of systems whose 
dynamics is described by a linear (integro-)differential equation with additive noise, white or col- 
ored. Examples are optical tweezers with hydrodynamic self-interaction and aliasing, calibration of 
Ornstein-Uhlenbeck models in finance, models for cell-migration in biology, etc. Because the bias 
takes the form of a simple multiplicative factor on the fitted amplitude (e.g. the diffusion coefficient) 
it is straightforward to remove, and the user will need minimal modifications to his or her favorite 
least-square fitting programs. 

Results are demonstrated and illustrated using synthetic data, so we can compare fits with known 
true values. We also fit some commonly occurring power spectra once- and- for- all in the sense 
that we give their parameter values and associated error-bars as explicit functions of experimental 
power-spectral values. 
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I. INTRODUCTION 

Optical tweezers (OTs) are often calibrated by fitting 
the power spectrum of a trapped bead's Brownian mo- 
tion with a Lorentzian [1-5]. Similarly, atomic force mi- 
croscopes (AFMs) are sometimes calibrated by fitting the 
power spectrum of the cantilever's Brownian motion with 
the power spectrum of a damped harmonic oscillator [6- 
8]. These fits are routinely done by least-squares (LSQ) 
minimization, the premises of which are rarely satisfied 
in practice. Here we do Maximum Likelihood Estimation 
(MLE) of parameters, the premises of which are satisfied, 
and give the parameter values, with error bars, as explicit 
functions of the experimental power spectrum in an excel- 
lent approximation. These closed-formula results should 
be useful for on-line calibration, since that requires high- 
speed determination of parameters. They also demon- 
strate that typical cahbrations using least-squares fitting 
contain 10-20% systematic errors, also known as bias. 

We find quite generally that both the sign and the size 
of the bias depends on the details of how a least-squares 



fit is implemented through the choice of how the data- 
points are weighted. Analytical expressions for the bias 
are given and examples of the behavior of the stochastic 
fit-errors are examined numerically. 

These results apply beyond the examples given here, 
since the same bias occurs in all systems described 
by a linear (integro-) differential equation with addi- 
tive noise. Thus, using the correction factors given in 
Eqs (32) and (41), it is possible to obtain correct unbi- 
ased estimates for the fit parameters without performing 
a computationally expensive MLE. Instead, one simply 
adjusts the results of an ordinary least-squares fit. 

We also show that the bias found for least-squares fits 
appears under fairly general conditions, independent of 
the details of the function that is fitted, but depending 
only on the distribution of the error on the data. Analyt- 
ical correction factors arc given for the examples of Gaus- 
sian, exponential, and gamma distributed errors. Finally, 
a general criterion is given for when one may expect a 
least-squares fit to be biased: When the experimental 
mean is correlated with the experimental variance. Con- 
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verscly, if they arc uncorrelated the fit is unbiased. 

The paper is organized as a main text followed by six 
appendices that contain proofs and other technical de- 
tails. 



II. DYNAMICS 

The equation of motion for a massive particle moving 
in a harmonic potential under the influence of thermal 
forces is 



mx{t) + 7x(t) + Kx(t) = i^thorm(t) 



(1) 



Here x{t) is the coordinate of the particle as function of 
time t, m its inertial mass, 7 its friction coefficient, k 
is Hooke's constant, and -Fthcrm is the thermal force on 
the particle. This force is random, and assumed to have 
white-noise statistical properties, 



(J^therm(t)) 
(-Ftherm (t) ^therm {t' ) ) 







(2) 



2fcBT7 5(t-i'), for all t,t' 



where 6 is Dirac's delta function, ksT the Boltzmann en- 
ergy, and (•) is the expectation value with respect to the 
noise. This theory's power spectrum of thermal motion 
is derived in [9]. 



III. POWER SPECTRA 

Optical tweezers should be calibrated using the hy- 

drodynamically correct power spectrum given in [9, 10], 
possibly taking into account a number of effects listed 
there and further detailed in [11-13]. These effects in- 
clude: (i) The frequency dependence of the friction coeffi- 
cient, due to hydrodynamic self-coupling; (ii) dependence 
of the friction coefficient on distance to nearby surfaces, 
due to hydrodynamic coupling; (iii) extra 1// power at 
low frequencies, caused by the laser pointing stability, 
hydrodynamic self-coupling, etc.; and (iv) opti(;al inter- 
ference effects, caused by a standing wave between the 
trapped object and the nearby microscope cover-slip sur- 
face, when determining the displacement sensitivity (Volt 
to nano-meter conversion factor). Generally, the most 
crucial step, where the largest systematic errors are likely 
to appear, is the determination of the displacement sensi- 
tivity [10]. However, as also described in [9], there are sit- 
uations in which an acceptable approximation is achieved 
by fitting a Lorentzian, 



Pf 



(3) 



to an average pj'^'^-' of, say, n two-sided experimental 
power spectra for the Brownian motion of a trapped 
microsphere (occasionally confusion arises over extra or 
missing factors of two in PSDs — this is down to the use of 
one-sided, / > 0, versus two-sided, / ^ 0, PSDs; as long 



as the total power is contained in the chosen frequency 
range the two approaches arc equally correct). The no- 
tation used is essentially the same as in [9], where the 
aliased Lorentzian is also derived, i.e., /c = k/{2'k^) and 
D = kBTh. 

Similarly, AFM cantilevers are sometimes calibrated 
by fitting a frequency interval around the resonance peak 
[8] using 



D/(2^2) 



if,nfi-pr+p 



(4) 
(5) 



to a similar average of experimental power spectra for 
a cantilever's Brownian motion. Here, the character- 
istic frequency /q = K/(47r^m) and the quality factor 
Q = y/rrm/j. Describing the AFM cantilever as a simple 
harmonic oscillator is rigorously correct for high quality 
factors Q ^ 1, e.g. in air where dissipative forces are 
small [8]. In water, this description also works, at least 
for short stiff cantilevers, but now the drag coefficient is 
frequency dependent [7] . It is not our aim here to review 
the expansive literature on AFM calibration — the only 
point we seek to make, is that if the amplitude of the 
power spectrum is involved, systematic fitting errors are 
typically present as shown below. 

Both these theoretical power spectra follow from the 
Einstein-Ornstein-Uhlenbeck theory for the Brownian 
motion of a damped harmonic oscillator in one dimen- 
sion; see Eq. (1) and [9], where the parameters in Eqs. (3) 
and (4) are also defined. In Appendix B we give the re- 
sults for the aliased AFM power spectral density (PSD). 

Least-squares fits of power spectra are biased, as shown 
below in Section V, irrespective of whether one applies 
the above simplified theory or a more complete one that 
takes into account aliasing, hydrodynamics, electronic fil- 
ters etc. This bias will be on the amplitude only and 
will not affect shape parameters, i.e., D will be biased, 
whereas fc, /o, and Q will not. 



IV. STATISTICAL PROPERTIES OF 
EXPERIMENTAL POWER SPECTRUM 

Here, and throughout the rest of the paper, we differ- 
entiate between theoretical expectation values, indicated 
by brackets, (•), and experimental averages, indicated by 
a bar, ~. Following the notation of [9] , we write the ther- 
mal force in Eq. (2) in terms of a normalized white-noise 
process ri{t) 



i^therm(i) = V2k^ v{t) 

whose Fourier transform obeys 



(6) 



(7) 
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where k,£ = ~N/2 + l,...,N/2 are integers. More- 
over, since ri(t) is an white-noise process, hence tempo- 
rally uncorrelated, Re fjk and Im fik are two mutually in- 
dependent random variables, uncorrelated for different 
A; > 0, with Gaussian distributions by virtue of the Cen- 
tral Limit Theorem, or, equivalently, by virtue ofr]{t) be- 
ing the first derivative of a Wiener process with respect 
to time. Consequently, the sum of their squares |77fcp is a 
non-negative random variable, uncorrelated for different 
A; > 0, with exponential distribution [17]. Hence, so are 
the experimental values P^'^^^ for the power spectrum for 

/ = A:/f,„sr > 0. 

Thus the dynamical theory defined in Eqs. (1) and (2) 
predicts not just the expectation value Pj for the exper- 
imental spectrum Pj''^\ as given in Eq. (4). It predicts 
also the distrihution from which the experimental power 
spectrum is "drawn": The power-spectral value Pi'^^^ 
at each frequency / is an independent random number, 
drawn from an exponential distribution with expectation 
value Pf, 



Pf{n 



l)/n, the variance, Pf/n, is the product of the 



mean and the scale, and the skewness is 2/y/n. From 
the known distribution it is easy to show that the g'th 
moment of P^'^^^ is 



^p(ex)q^ 



T{n + q) 
nir{n) 



(15) 



a result that we shall need later. 

In the limit n — >■ oo this distribution approaches a 
Gaussian by courtesy of the Central Limit Theorem, but 
it is a slow approach, because the starting point, the 
exponential distribution, is highly skewed. For moderate 
values of n, Pn{P'f^'^ ; Pf) is far from Gaussian, see Fig. 1. 
It is also quite skewed, but this is not the source of the 
bias. The bias is caused by the correlation between the 
experimental mean, P^^^^ , and the experimental variance 



n 

!/p(ex)x _ ^ Y^i-pCex) p(ex)N2 

i—1 



(16) 



(ex). 



PiPf 

Consequently, 



P/) = -exp(-P)'^''V^'/) 



^pjex)^ 



Pf 



(8) 



(9) 



aiPi^-^)^{iPp>^Pfr)y^^P, , (10) 

and the signal-to-noise ratio {P^'^^^)/a{P^'^^^) equals one. 
This is why we average over n experimental spectra, 



3(ox) 



3(ex) 



1 " 



n ^-^ 

i=l 



(11) 



before plotting and fitting: To reduce noise. 

If the n spectra are statistically independent — as is 

the case if they arc computed from data taken in non- 
overlapping time intervals — then we have, unchanged, 
that 



^pjex)^ 



but 



a(Pf-')=a(P)^'=))/V^ 



(12) 



(13) 



as discussed in greater detail below. The latter is easily 

/ 



shown to satisfy (s^(P)'"^)) = ^^(pj""^^ 




FIG. 1: The gamma distribution, Eq. (14), as function of 

pjcx)yp^ for various values of the shape parameter n: Thick 
full line, n — 1; full line, n — 2; thin full line n = 16. Also 
shown is the Gaussian limit value, plotted here with a variance 
of 1/n = 1/16 (dashed line). 



and P^'^^^ is distributed according to a distribution that is 
the convolution of n identical exponential distributions, 
viz. the gamma-distribution: 



Pn(Pf^P/) 



j(ex)n-l(n/P/)' 



r(n) 



exp f-nPHZ-Pj 



(14) 

where n is the shape parameter, P//n the scale param- 
eter, and r(ri) = (n — 1)!. The mean, Pf, is the prod- 
uct of the shape and the scale parameters, the mode is 



A. Ubiquity of the gamma-distribution 

As a matter of fact, all systems described by a linear 
(integro-)differential equation with additive noise, be it 
white or colored, have power spectral values that are de- 
scribed by the statistics in Eq. (14). To see that white 
and colored noise lead to the same statistics, all we need 
to realize is that the power spectral density of a colored 
noise is the product of the power spectral density of a 



4 



white noise and some function F{f) that describes the 
color of the noise (a filter function): Quite generally, if 
the dynamics of x is given by a polynomial Q in d/dt 
with constant coeflScients 



Q ( ^ ) xit) = m , 



(17) 



where V is a colored noise, then the power spectrum 



|Q(-z2^/)|2 |0(-i27r/)|^ 



(18) 



is seen to be the product of the power spcctriim of a 
white noise rj and some function of /. Thus, for a given 
frequency /, the statistic;s of the power spectral value is 
determined by the statistics of white noise. 



calculated from independent, identically Gaussian dis- 
tributed random variables are statistically independent. 
In this Gaussian scenario it is possible to assign mean- 
ingful confidence intervals to the fit-parameters and cal- 
culate the goodness of the fit, also known as its support 
or p-value, and this is why one is sometimes tempted to 
assume that data are Gaussian even when it is not quite 
the case. Finally, with Gaussian data MLE and WLS 
are mathematically identical as it easily seen by insert- 
ing a Gaussian in p„'s place in Eq. (50) and deriving the 
corresponding cost function, which is simply x^- 

We now ask what the effect is on the estimate of the 
fit-parameters if the data are not Gaussian, the weights 
are not independent of the experimental value, and the 
model is a non-linear function of the fit parameters. All 
of these circumstances are realized when fitting power 
spectra with the statistic given in the previous section. 



V. LEAST-SQUARE FITTING 

As alluded to in the previous section, one effect of the 
exponential distribution of Pj'^^ ^ is that weighted least- 
square fitting to pj'^^^ does not follow the text-book be- 
havior and returns a result that is systematically wrong, 
or, biased. In the following we remind the reader of the 
rather narrow set of circumstances under which least- 
square fitting works. We then relax the requirements 
and see what efi'ect it has on the fit results. 

Weighted least-squares (WLS) fitting minimizes 



x'(^) = E[2'^-/'W]'^' 



(19) 



i=l 



with respect to fit-parameters 9. Here, yi = f + are 
the experimental data (dependent variable), errors, fi 
the fit-function evaluated at i (the independent variable), 
and Wi are weights. It is assumed that the experimen- 
tal data are uncorrelated and that the weights are un- 
correlated with the experimental data. If, in addition, 
the weights are also independent of i, and thus constant, 
weighted least squares reduce to ordinary least squares 
(OLS). When the theoretical model, fi, is a linear func- 
tion of the fit parameters, 9, both OLS and WLS mini- 
mization are known to return parameter estimates that 
are BLUE (best linear unbiased estimate). The former 
was first shown by Gauss in 1829 whereas the latter was 
shown by Aitken in 1935 [14]. Specifically, in WLS the 
errors do not have to be independent, just uncorrelated, 
and in OLS they do not have to come from the same 
distribution, they are only assumed to have the same 
variance (homoscedasticity) . 

Things turn particularly simple when the experimen- 
tal data are Gaussian distributed. We can then choose 
Ui, e.g., as the average of several measurements, and the 
weights, Wi, as the reciprocal of the experimental stan- 
dard deviation. With this choice, yi and Wi are inde- 
pendent, because the sample mean and sample variance 



A. Some analytical results for bias in lecist squares 
fitting 

An estimator, 9, is biased if its expectation value differs 
from the true value, {9) ^ 9*, of the quantity it is an 
estimator for. Below, we show that some common least- 
square estimators for power spectra are biased. 

At the minimum of the function given in Eq. (19), 
the first derivative with respect to the fit parameters is 
zero, and the stationarity conditions thus reads 



N 



N 



^[Vi - fi] Wi defi = ^[Vi - fif Wi 



gWi 



(20) 



Notice, that in order to be completely general we have 

allowed the weights to depend on the fit parameters. We 
now treat a few specific, but universal, scenarios one at 
a time. 



1. Experimental standard deviations as weights 

Proceeding as usual with classic WLS, we pick the 
weights to be inversely proportional to the experimen- 
tal estimate for the standard deviation. This estimate 
could be the known uncertainty from the experimental 
apparatus, but often it is estimated simply as 



1 " 

n,i = A _ 1 ~ Vi)^ ) 



(21) 



where n is the number of times the experiment is repeated 
with the independent variable set to its ith value. The 
experimental average (sample mean) 



1 " 



(22) 
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is our estimate for the expectation value of yi. Doing this 

for each «, 



N 



i=l 



The stationarity conditions then reduce to 



AT 



N 



=1 ^n,i 



(23) 



(24) 



the solution of which gives us our estimate, 9[y], for 9 
for a given data-set y = {yi,j)i=i,...,N;j=i,...,n- Note, be- 
cause A'' and n are finite the estimate for 6 resulting from 
Eq. (24) is a stochastic quantity and will vary from one 
experimental realization to another. 

To calculate the bias of the estimator for given n, we 
need to find its expectation value for that n. We do this in 
two steps: First, we let the number of data-points — >■ 
oo, while keeping n and the number of fit-parameters 
fixed, as in an infinite experiment. In this limit the fit 
returns an estimate 0* of 9 that is no longer a fluctuating 
quantity, but may still depend on n. So, for A'' = oo 
Eq. (24) reads 



WDdeWl) . (25) 



Note, that the experimental values, ?7i and Sn,i, arc still 
fluctuating quantities described by the same statistic as 
before, whereas fi{9n) and defi{9^) are not fluctuating 
because they are functions of non-fluctuating variables. 
So, when in the next step we take the expectation value 
of Eq. (25), only y^ and s„^j are affected 



E 



doM9*J=J2 



i=l \"na / 

which is solved by 




(26) 



W*n) = JTtH^ for alH = 1, . . . , oo , (27) 

if a parameter set 0* exists, which solves all these many 
equations simultaneously. Provided that we know the 
distribution of yi, we can calculate the expectation val- 
ues at the right-hand side of this equation and thus de- 
termine whether the fit is biased. For a start, note that 
if the sample mean and the reciprocal sample variance 
are uncorrelated, then the numerator on the right-hand- 
side factorized to (yi)(l/s^i)- Thus the right-hand-side 
equals {yi), i.e. the fit is unbiased. So it is sufficient that 
mean and reciprocal variance are uncorrelated to ensure 
an unbiased fit. 

It turns out that it is also a necessary condition, if there 
is no redundancy in the parameterization of /j by 6, i.e., 
if their relationship is one-to-one. This is the only sensi- 
ble way to parameterize a function, and easily achieved 



by elimination a possible redundancy through reparam- 
eterization. Assuming this one-on-one relationship, we 
use that fi{9*) = {yi) and Eq. (27) to write 



fm = 



(1/4.) 



fi{K)^ for alH = l,...,oo 



(28) 



We now see that the fit is unbiased if and only if the term 
in square brackets in Eq. (28) equals unity. 

In summary, a necessary and sufficient criterion for 
least-squares fitting to he unbiased is that the sample 
mean and reciprocal sample variance are uncorrelated. 
Note that 'uncorrelated' is a weaker requirement than 
'independent' and although the latter implies the for- 
mer, the reverse is not generally true. Naturally, when 
the sample mean and sample variance are independent, 
the sample mean and reciprocal sample variance are also 
uncorrelated. 

Notice that we nowhere used what s„,i is, and in fact 
a more general version of Eq. (28) is 



iVi) 



iViwf) 



(29) 



where Wi is assumed independent of 9 but is otherwise 
unconstrained. The criterion for an unbiased estimator 
is now simply that the data, yi, and the squared weights, 
wf , are uncorrelated. 

Under some circumstances it is possible to determine 
the bias of the fit parameters from the bias of the fit- 
function given in Eq. (29). If the function is invertible 
this is trivially the case. But, it is also possible if the 
amplitude of the fit-function is determined by just one 
of the fit-parameters and the term in the square brackets 
is independent of i. In this case, bias can be removed 
from the parameter estimate, 9, by simply multiplying 
the amplitude-parameter by the square bracket and leav- 
ing the other parameters unchanged. In the next section 
we give an example of this. 



2. Experimental averages as weights 



If the sample standard deviation is proportional to the 



sample mean. 



oc y., 



for each i, the mean and variance 



are clearly not uncorrelated and we expect the parameter 
estimate to be biased. From Eq. (28) we find 



MO*) 



2\ 1 



(30) 



As a real-world example, consider gamma-distributed 
experimental data. We already mentioned that the 
gamma-distribution describes all power-spectra resulting 
from linear dynamical equations driven by an additive 
noise, white or colored. Carrying on as before, we use the 



6 



standard deviation to weight the data, Wf ~ l/sn{pj.^^^), 
where / is the frequency. Since the standard deviation 
and the mean are proportional, see Eq. (13), we have 
Wf oc Using Pj^^'^'s known gamma-distribution, 

see Eq. (14), we can compute ((-Pj"^')^'') for 5 = 1,2,... 
and find the value of the bracketed bias-term in Eq. (30) 



(e.),((^r)-^) 



{pp>) 



((p(-))-l) 



n/(n- 2) 



(31) 



That is, weighted least squares fitting of this wide class 
of power-spectra has a built-in, frequency independent, 
multiplicative bias of for n > 2. The true power 
spectral form can then be obtained from the least-square 
fit as 



n-2 



(32) 



In this scenario, the theoretical iV — >■ 00 limit that we 
took in going from Eq. (24) to Eq. (25) corresponds to 
letting the measurement time (or sampling frequency) 
become infinite while keeping the sampling frequency (or 
measurement time) and all other experimental factors 
fixed. 

In the simple case of a Lorentzian, Eq. (3), we have 
r = (A/c)and 



D/{2^^) n D*J{2^^) 



f,^ + f2 n-2iflJ^ + f 



2 ' 



(33) 



from which we see that /*„ is unbiased. In other words, 
the true value of the fit parameters can be obtained from 
the WLS estimates as 



D = D* « 

n-2 " n-2 



/c — fc.n ~ /< 



(Isq) 



(34) 

(35) 



where D^^'^i) and /c'-''^'^' are the stochastically fluctuating 
values returned by a least-square fit to a finite N data- 
set. This is a general feature of the power-spectra: They 
can be written as the product of an overall multiplicative 
scale-factor and a shape-function depending on /, where 
only the scale-factor is influenced by the bias. 



and the stationarity equations Eq. (20) become 



N 



N 



E dofi = E Vi 90 fi ■ (37) 



i=l 



Repeating the arguments from the previous section, the 
expectation value of the stationarity conditions in the 
limit — 00 is 



^ivi) fi\K) defi{ei) = Y,{y1) fi\K) dofiie*J , 

i=l i=l 

(38) 

which is solved by 

/i(C) = foralH = l,...,oo , (39) 

if a parameter set 0* exists, which solves all these many 
equations simultaneously. The true value, fi{9*), can in 
that case be written as 



MO*) = [{y^f/{y^)] Wl) 



(40) 



and 9* can be determined from ^* if {yi)'^l{y^) is inde- 
pendent of i, constant, and can be absorbed in 0* in a 
simple manner. 

If we again use the gamma-distributed power spectral 
data as example, we see that 



Pf{o*) = —Pf{e*j 



(41) 



i.e, we once more have a bias that scales with the number 
n of spectra averaged over, although this time the bias is 
half the size and positive. 

Closed- form expressions for /*„ and D* are straight- 
forward to obtain, when Pf is a Lorentzian [9] (reference 
[9] contains a typo: DT^^r should simply read D). The 
expression for /*„ given in [9] is un-biased, whereas the 
expression for D* has its bias removed by multiplying by 
n/{n -t- 1). Note, that apart from the n/(n -|- 1) factor on 
D, the results derived in [9] are mathematically identi- 
cal to those derived using an MLE approach in the next 
section. 



4- Constant values as weights ("un-weighted") 



3. Theoretical values as weights 

If the sample standard deviation is known to be pro- 
portional to fi{0*), it seems reasonable to weight the data 
by the theoretical value l/fi- After inserting Wi oc l//j 
in Eq. (19), the quantity to minimize is 



JV 



x'{0) = Y.ly^-fi{e)?l-\o) 



(36) 



When all weights are assumed to be equal, the function 
to minimize, 



JV 



x'{0) = ^[yi-fmf , (42) 

has stationarity equations which, for A'' —> 00 gives 
00 00 

f,{0l) defi{ei) = y^ 9ofi{e:) . m 
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Taking the expectation value on both sides, we have 

oo oo 

^fiK)defiK) = T.iy^)deM0*n) ■ (44) 

i=l i=l 

This equation may have the solution 

MO*n) = {yi) = Mn ■ (45) 

In other words, this estimator is unbiased. However, it is 

not a precise estimator; The stochastic errors on values 
it returns tend to be larger than those obtained with a 
weighted fit. Figures 2 and 3 illustrate this for the case 
of fitting an aliased Lorentzian to power spectra from 

optical tweezers. 
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FIG. 2: Stochastic error on fit parameters as a function of 
cut-off frequency, /max, for least squares fitting of aliased 

Lorcntzians using various weights, Eq. (19). For all fits the 
errors are substantial for low values of /max- Notice the loga- 
rithmic frequency scale: For largo /max, WLS returns a vari- 
ance nearly an order of magnitude smaller than for OLS (fit- 
ting with constant weights). The total number of acquired 
data points was held fixed aX N = 262, 144, and the fit was 
done to the average of n = 16 power spectra generated with 
fc = 500 Hz, D = 0.46 ^m/s^ and /sample = 16, 384 Hz. Sym- 
bols show the coeflicient of variation, s„{D)/D, with D mea- 
sured over 100 independent stochastic simulations. The coef- 
ficient of variation for /c show similar trends and are roughly 
two times larger at these settings (data not shown) . For OLS 
(triangles, Eq. (42)) the error is independent of /max for large 
frequencies because the information there is de-emphasized 
by the fitting algorithm. Using the experimental standard 
deviation or mean as weights (filled circles, Eq. (23)) leads 
to stochastic errors nearly as small as when using theoretical 
weights (empty circles, Eq. (36)) — the systematic errors (bi- 
ases) are twice as big and of the opposite sign however, see 
Fig. 4. 



5. Iterated values as weights 

When weights are determined iteratively, a fit is per- 
formed and the value returned for 9 is used in the weight 
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FIG. 3: Stochastic error on fit parameters as a function of 
number of power spectra averaged over, n, for least squares 
fitting of aliased Lorentzians using various weights, Eq. (19): 
Constant weights (triangles); experimental standard devia- 
tion (black circles); experimental average (grey circles); theo- 
retical weights (white circles). Thick black line through white 
circles show the theoretical prediction Eq. (E2). The number 
of acquired data points was held fixed at A'^ = 262, 144, and 
the fit was done to all the data, /max = /Nyq, for power spec- 
tra generated with fc = 500 Hz and D = 0.46 /um/s^. Sym- 
bols show the coefficient of variation, s„{D)/D, with D mea- 
sured over 1,000 independent stochastic simulations. Using 
theoretical weights clearly outperforms all the other weighing 
schemes for n < 32; for n > 32, the experimental weighing 
scheme performs comparably well — use of the experimental 
standard deviation as weight leads to slightly larger variation 
than use of the experimental average, as expected, and both 
show more variation than results obtained with theoretical 
weights. Results obtained with constant weights — sometimes 
rcfcrrc(i to as "no weights" or "un- weighted" — show roughly 
ten times larger variation than does results obtained with the- 
oretical weights, for all n. 

of a new fit and so on, until some convergence criterion 
has been met. The function to minimize with respect to 
for given e^'*^'^ reads 

N 

X^(^) = 5][y,-/,(e)]^«;,2(^(Her)) _ (4g) 

i=l 

It has the stationarity equations 

N N 

E fi wKO^'''^)defi = J2 ^iiO^''^''^) defi • (47) 

i=l i=\ 

We now use the same arguments as above together with 
the observation that at the fixed-point solution to this 
iterative scheme, ^^('*°'') = and both equal 0* in the 
limit iV — >■ oo. Taking the expectation value of the equa- 
tion after taking the limit N ^ oo, we see that ^* may 
satisfy 

MK) = iVi) = Wn , (48) 

in which case this estimator is unbiased. The estima- 
tion scheme just described is also known as "iteratively 
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FIG. 4: Systematic error (bias) on fit parameters as a function 
of number of power spectra averaged over, n, for least squares 
fitting of aliased Lorentzians using various weights, Eq. (19). 
The corner frequency is unbiased (data not shown), indepen- 
dent of n, whereas the diffusion coefficient shows a strong 
weighing-scheme dependent bias. The sampling frequency 
was held fixed at /sample = 16,384H[z, and the fit was done 
to the averaged power spectra generated with A'^ = 262, 144 
(ijjisr = 16 s), /c = 500 Hz, and D = 0.46 /^m/s^. Symbols 
show the signed error on the mean of D measured over 1,000 
independent stochastic simulations. For least squares fitting 
with constant weights (triangles) the bias on D is zero. Least 
squares fitting with experimental weights (black circles: stan- 
dard deviation; grey circles: average value) has a systematic 
error on D with — 2/n dependence (thin black line) for n > 2. 
Theoretical weights (white circles) lead to an 1 /n-dependence 
(thick black line) in the systematic error on D. 



reweighted least squares" and convergence is not guaran- 
teed. 



6. Error-bars 

Error-bars on the parameter estimates for 0* , can be 
calculated from the error-bars on the estimator 9n for ^* 
and the simple relationship in terms of n between 6* and 
6** , by propagation of errors (see Eq. (73)), provided the 
latter exists. The error bars on 9^ can be those returned 
by a least-squares fitting routine used to determine 9n, 
or they may be know theoretically. In that connection, 
note that the error-bars given in [9] for WLS with theo- 
retical weights are correct only in the limit n = oo, but 
by replacing D with n/ {n -\- 1)D they are correct for all 
n. 

When 9 is found from a 0„ which itself is found by 
minimizing , the precision with which is known de- 
termines the precision of the estimate 6. Because the 
various weights in have different stochastic fluctua- 
tions, we expect that theoretical weights, Eq. (36), will 
give smaller variation in 9 than experimental weights, 
Eq. (23). A small calculation shows that the variance of 
the stationarity equations, Eq. (24), around zero includes 
a term {pf''^^'^) oc {(n - l)(n - 2)(n - 3)(n - 4)}-\ i.e., 



the error-bars are expected to be large for n ^ 4, if ex- 
perimental weights are used. When theoretical weights 
are used, the error-bars are small and well-defined for all 
n. These results are verified by example, see Fig. 3. 

B. Results for power spectra 

We verified the above theoretical results by Monte- 
Carlo simulations of the OT and AFM power-spectra and 
found perfect agreement. Below, we list the specific re- 
sults. 

For the diffusion coefiicient D, we found that the out- 
come of least-squares fitting, £)('^i\ depends on the num- 
ber, n, of power spectra averaged over in the follow- 
ing manner for the aliased (see Fig. 4) and non-aliased 
Lorentzians, as well as the aliased and non-aliased AFM 
expressions (data not shown): 

1. D(i'^i) = D{n - T)ln, n > 2; if data-points 
are weighted with their experimental standard- 
deviation or mean, Wf = l/sniPf°^^) or w/ = 

weighting choice with the 

largest bias and with large stochastic errors for 
n < A. Here, D is underestimated. 

2. £)(^''i) = D{n + l)/n; if data-points are weighted 
using the theoretical expectation value for the 
standard-deviation, Wf = ypnjPj. This is the least 
squares fitting of [9]; a hybrid of MLE and LSQ, 
analytically tractable and thus very robust. It also 
has the smallest stochastic errors. Here, 
overestimates D. 

3. = £); if the weights are updated iteratively 
to the fitted value, w/ = l/Pj^*'*''. This is a prac- 
tically correct but computationally intensive and 
somewhat unstable approach; the initial guess must 
be close to the true value, or the algorithm will not 
converge. 

4. P)'^^'^^ = D; if all data-points are given the 
same weight, Wj = constant. This method de- 
emphasizes information available at higher frequen- 
cies because power spectral values have constant 
relative error, hence rapidly decreasing absolute er- 
ror beyond /c or /o, yet are treated as having con- 
stant absolute error. Consequently, this estimator 
has low precision, typically with two to ten times 
larger stochastic errors than the Cases 1 3 above. 
This is OLS fitting applied far outside its range of 
validity. 

Figures 2, 3, and 4 show examples of the size of these 
stochastic errors and biases, respectively, for fits of the 
aliased Lorentzian to data. 

We found that the expectation values of the fitted 

values for /c, /o, and Q were independent of weighing 
scheme and un-biased, whereas the standard deviations 
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varied by up to an order of magnitude depending on 
weighing scheme applied. Fiirthcrmore. fitting of non- 
aliased PSDs to aliased data introduces large systematic 
errors for the trivial reason that they fail to capture the 
shape of the PSD near the Nyquist frequency (data not 
shown). 

VI. MAXIMUM LIKELIHOOD ESTIMATION 
OF FIT PARAMETERS 



A. From non-linear to linear stationctrity equations 
via a simple trick 

Although we gave the solution to the full MLE prob- 
lem implicitly above, as the minimum of T Eq. (53), we 
can speed up the fitting process substantially (numerical 
optimization can be slow when the data-sets are large) 
and gain more insight into the problem by taking a few 
more analytical steps before turning to numerics. 



We now proceed to determine the fit-parameters by 
the method of maximum likelihood estimation. We also 
derive closed-form expressions for the fit-parameters in 
terms of the experimental data values. They should be 
useful for speedy online calibration. 

Both power spectra, given in Eqs. (3) and (4), are con- 
sequences of a linear, time-invariant dynamics driven by 
a white noise, Eq. (1). They are consequently of the form 

" a + bp+cf^ ' ^^^^ 

with c = in the case of the Lorentzian, and (a, b) or 
(a, 6, c) parameters to be fitted. This simple form, com- 
bined with the simple statistical properties of the exper- 
imental power spectral values, makes rigorous MLE of 
the parameters a straightforward numerical optimization 
problem as shown below. 

When fitting, we fit only to the positive-frequency part 
of the power spectrum, or to a subset of it. So here we 
considered only that part of the spectrum. Since P^*^^^ 

and Pp^^ are uncorrelated for f ^ f\ the probability 

density for the experimental spectrum Pj'^^\ given its 



(cx). 



(50) 



expectation value P/, is 

P(P(-)|P)=[]P„(P)' 
/ 

where p„ is given in Eq. (14). Thus Maximum Likelihood 
estimation of the theory's parameter values consists in 
choosing these parameters so they maximize p{P^'^^^\P) 
for given p('^^\ or, equivalently, minimize the negative 
logarithm 

^' = ^Y1 i^f"''^ /^f + ^f) + constant . (51) 



where 



constant = In r(n) — n In n — (n — 1) In P^' 



(ex) 



(52) 



is a constant with respect to (o, b, c) so it can be ignored 

when minimizing. We are then left with the task of find- 
ing the values of (a, 6, c) that minimize the cost function 



J^ia,b,c)^J2{Pt^/Pf + ^''Pf) ' 



(53) 



which is an uncomplicated optimization problem that can 

be solved numerically with standard programs. Good 
starting values are given in Eqs. (63) and (64). 



1. Results for Optical Tweezers 

For Pf given in Eq. (3), written as in Eq. (49) with 
c = 0, in Eq. (53) reads 



.F(a, 6) = ^ ((a + bf)Pp''^ - ln(a + bf 



(54) 



It is minimized with respect to a and b when these pa- 
rameters satisfy the stationarity condition 

f f f ■' 



a + bp 



(55) 



These are non-linear equations for a and b. However, 
from Eq. (15) we know that we can write 

^' = 5r^ = iW<""'^'- '''' 

Substituting for Pf in Eq. (55) we find 

E^r = ;r^E(^^^^^')(« + ^/') 

E/'^r = -^E/'<^^^'''')(«+^/') '(57) 

which is linear in a and b. 

Before solving for a and b we introduce the statistic 



(58) 



with K the number of terms in the sum, and the statistic 
obeys \\vt\K^oo Sp^q = (-^p.^)- 

The sums should only include those frequencies the 
user deems relevant, i.e., frequencies corresponding to 
mechanical or electronic resonances can be excluded, 
and high/low frequency cut-offs can be applied. That 
is, the statistics can be trimmed iteratively if required: 
Power spectral values too far from a fit can be iden- 
tified and excluded from the sums, after which a new 
fit is found, et cetera until a steady state is reached 
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and all power spectral values satisfy the user-defined ac- 
ceptance criterion. When no frequencies are excluded, 
nK = N = tmsr/sampiei the total number of data ac- 
quired. In this latter case, one fits the power spectrum 
all the way out to the Nyquist frequency, and aliasing 
should be taken into account (see appendices); unless 
aliasing is eliminated by over-sampling data acquisition 
electronics. 

With this notation, Eq. (57) can be written in matrix 
form 



('S'0,2) ("Si, 2) 

{Sl,2) ('S'2,2) 



(59) 



with solution 



1 + 1/n 



Sn,i{S2,2) 

Sl,l {So,2) 



Sis {Sl,2) 

5'o,i('S'i,2) 
(60) 

In this expression, we know Sp^q from the experiment. 
However, we can always write Sp^q 



but not (Sp^q). 
(Sp^q) +SSp,q and substitute for (S'p,g' 
SSp^q it is easy to show that {SSp^q) 



inEq. (60). About 
and 



{{SSp,qr) = 



1 

1 

K 



T{n + 2q) fT{n + q) 



n29r(n) 



n'?r(ri) 



.2q 



[weak n-dep.] S'2p,29 



f 

(61) 



where S is defined as in Eq. (58) except for Pf = (p)''''^) 
replacing Pj°^\ Above, g = 2 in the relevant expres- 
sions and therefore Sp^q scales as 1 /K for p = 0,1 and 
is independent of K ioi p = 2. Thus, apart from some 
n-depcndcncc, the variance of Sp_q scales as or 1/K 

both of which are very small numbers in a typical experi- 
ment, where K = N/n is of order 10^ 10^'. It is therefore 
an excellent approximation to replace {Sp^q) with Sp^q in 
Eq. (60) 



So,2S: 



1 + 1/n / So,iS2,2 - 5i,i5i,2 

■ \ 'S'l,l'S'o,2 — 'S'o,l5'l,2 

(62) 

where now (a, b) are our estimates of (a, b). By comparing 
Eqs. (3) and (49) we then have 



/c 

D 



a\i/2 
b) " 
27r^ 27r^ 
b ~ b 



1/2 



2TT^n 



5*1, 15*0,2 

5o,252.2 - 



- 5o,i5: 
02 



1,2 



n + 1 5i i5o 2 ~ 5o,i5i 2 



1/2 
(63) 

(64) 



2. Results for AFM cantilevers 

Identical reasoning leads, in the case of c 7^ 0, to the 
three coupled linear equations for a, 6, and c 



(65) 




5l,2 


5*2,2 \ 






52,2 


53,2 1 




« (1 + 1/n) 


53,2 


54,2 / 








which are inverted to give 





(66) 



Comparing Eqs. (4) and (49) then gives 

1/4 



1/4 



(^) 



D = 

2 



27r^ 



27r^ 



6 + 2(oc)i/2 S + 2(ac)V2 
c c 



6 + 2(ac)i/2 S + 2(ac)i/2 



(67) 

(68) 

(69) 



For completeness and ease of implementation we give 
the inversion formulas: = C/P, 



V 



5o, 252, 254, 2 — 5o, 25-52 

5i 954,2 + 25i,252,253,2 — 52 



Co Ci C2 
Ci C5 C3 
C2 C3 C4 



(70) 



(71) 



and 



Co 


— 52,254,2 


- S'2 

•^3,2 




= 52,253,2 


— 5i,254,2 


C2 


= 5i,253,2 


•^2,2 




= 5i,252,2 


— 50,25*3,2 


C4 


= 5o,252,2 


'-'1,2 


c. 


= 5o,254,2 


•^2,2 • 



(72) 



Of course, it is also possible to numerically invert 
Eq. (65) and insert the resulting values for d,b,c in 
Eqs. (63,64,67,68,69). The above results however, should 
resolve potential issues with numerical stability arising 
from the inversion of near-singular matrices. 



B. Examples of Fitting tiie power spectra 

Figure 5A shows the average of n = 16 power spectra 
for the Brownian motion of a micro-sphere in an optical 
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trap. The data are synthetic, computer generated; see 
[9]. Thus, the values for /c and D are known exactly, 
and can be used as benchmarks for results obtained by 
fitting, with no worry about any of the complicating cir- 
cumstances that can affect data in the real world. The 
red line is the expectation value of the aliased Lorentzian, 
taking as input the exactly known values of /c and D. 
The yellow line is the aliased Lorentzian corresponding 
to the stochastically realized parameter values for /c and 
D, determined by rigorous MLE of (a, h) with no use of 
our simplifying trick. The dashed blue line is the result 
of MLE, with the use of our simplifying trick. All of these 
three lines plot virtually on top of each other. Finally, the 
black lines shows the result of least squares fitting of an 
aliased Lorentzian to the data, using as weights: (i) The 
standard deviations on the n = 16 spectra with a result- 
ing — 2/n = —12.5% systematic error on D, see Eq. (32); 
and (ii) Theoretical weights, resulting in a 1/n = 6.25% 
systematic error on Z?, see Eq. (41). 

Note, that for clarity of presentation we discuss the 
non-aliased cases in the main-text and give closed-form 
aliased results in Appendices B and C. Numerical tests 
of the theory are done using the aliased theory in order to 
utilize all the available data: When fitting a non-aliased 
expression to aliased data it is necessary to introduce a 
cut-off frequency /max ^ /Nyq and discard all data above 
it. 

Figure 5B shows the same data and fits in a normal- 
ized residual plot, i.e., data or fits minus the true ex- 
pected value (the residue) divided by the true expected 
value. Thus, deviation from zero in this plot shows by 
how much data and fits differ from the true expected 
value, measured in units of the true expected value. 
The data should scatter about with standard deviation 
n""'^/^ — 16^^/^ ~ 25%, whereas the fits should simply 
trace zero for all frequencies. Here, the true expected 
value is known because we use synthetic data; we nor- 
malize by it to show the results of all fits in a single 
figure. In an experiment, the true expected value is not 
known and the fitted value is used instead — when a bias 
is present the residues will consequently differ from zero 
in a systematic manner. 

This bias always hides in the scatter of the data in a 
figure like Fig. 5A, because < n~^^^ for n = 1, 2, . . .. 
But, the bias is substantial for small to intermediate val- 
ues of n and will reveal itself in a plot like Fig. 5B. 

Figure 6 is similar to Fig. 5, except its power spectrum 
describes the confined Brownian motion of a massive par- 
ticle, e.g. an AFM cantilever in air, and the fits do not 
take aliasing into account. The data are synthetic, com- 
puter generated; see Appendix A. 



C. Error-bars on fit parameters 

Because we have derived closed-form expressions for 
the fit parameters we can also calculate expressions for 
the expected error-bars on the fit parameters. We do 




012345678 
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FIG. 5; Various fits of aliased Lorentzian to synthetic power 
spectrum for microsphere in optical trap. Data are artifi- 
cial, with known parameter values fc = 500 Hz and D = 
0.46 /^m^/s, /sample ~ 16, 384 Hz, and imsr = 16 s. A: Data 
and aliased Lorentzians. PSD values (grey line) are averages 
over 16 spectra. Note, how the three Lorentzians correspond- 
ing to, respectively, the exactly known values of fc and D 
(red line), rigorous numerical ML-fit (yellow line), and an- 
alytical ML-fit using simplifying trick (dashed blue line) all 
plot on top of each other. In contrast, the Lorentzian from 
numerical least-squares fits with experimental or theoretical 
weights (thick and thin black lines, respectively) are offset 
by -12.5% and -1-6.25% due to their systematic errors on D. 
B: Residual plots, i.e., same data and fits as shown in Panel 
A, but divided by the known true expectation value, then 
unity subtracted. The agreement between data and the ML- 
fits is practically perfect whereas the least-squares fits clearly 
under/over-estimate the PSD. 



that and compare to the theoretical limit for how small 
the error-bars can be. 

Quite generally, irrespective of whether we study the 
aliased or non-aliased Lorentzian or the AFM PSD, the 
error-bars on the fit-parameters are found by propagating 
the errors on a, 6, and c using the generic formula for a 
function z (calculating the differential Az and squaring 
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FIG. 6: Various fits of Eq. (4) to tlie average of 16 power 
spectra for an AFM cantilever. Data are synthetic with 
known parameter values /o = 15 kHz, D = 0.0010 /im^/s, and 
{^f = 6169 Ms^ /sample = 65, 536 Hz, and W = 8 s. A: 
Data and fits. Note, how the three curves corresponding to, 
respectively, the exactly known parameters (red line), rigor- 
ous numerical ML-fit (yellow line), and analytical ML-fit us- 
ing simplifying trick (dashed blue line) all plot on top of each 
other. In contrast, the result of a numerical least-squares fits 
with experimental or theoretical weights (thick and thin black 
lines, respectively) are offset by -12.5% and -1-6.25% due to 
their systematic errors on D. In all the fits, only frequencies 
below 20kHz were included. B: Residual plots, i.e., same as 
shown in Panel A, but divided by the exactly known expecta- 
tion value, then unity subtracted. At the highest frequencies, 
where the non-aliased fits start to deviate from the aliased 
PSD, a slight departure from a straight line is seen in all the 
fits. 



ance matrix 

((Aa)2) (AaA6) (AaAc) 

cov(a,6,c)=( (AaA6) {{Mf) (A&Ac) | . (74) 

(AaAc) (A6Ac) ((Ac)^) 

This covariance matrix can be written in terms of the 
expectation values for the statistics, see Appendix D, 

1 n -|- 3 

cov(a,5,c) ~ cov(d,6,c) = — (S)^"^ (75) 



N n 



with 



(S) 



n + lg^ 



(76) 



where S is a matrix of the same form as S in Eq. (65), 
but with expectation values Pf = {P^'^^^) replacing the 
experimental values P^®'') in the statistics Eq. (58), see 
Eq. (F3). 

To find out how much of the available information we 
are putting to use, we can express the covariance matrix 
in terms of the Fisher information matrix I = A^S for 
the full (not the trick) MLE problem 



cov(a, b, c) w cov(d, b, c) 



n + 3 



(77) 



from which we see that as n increase, we approach the 
Cramer-Rao bound [15] on the variance for an unbiased 
estimator, which is just I^^. The main thing to notice, 
however, is that all the covariances in Eq. (75) are very 
small because N is very large — the weak n dependence 
(see Fig. 7) is of no consequence in comparison. In an 
experiment, simply insert the fitted value for the power- 
spectrum as a best estimate for in S. 



1. Results for Optical Tweezers 

For the non-aliased Lorentzian in Eq. (3) we then have, 
using the generic Eq. (73) 



2 r 



((Aa) 



((A6)2) ^{AaAb) 



fe2 



ah 



(78) 



a'{z[a, b, c]) = ((Az)2) = (73) a'(D) = D^^^^ , (79) 

{d^zf {{Aa)') + {d,z)' ((A6)2) + {d,z)' {{Ac)') 

+ 2dazdbz{AaAb) + 2dazdcz{AaAc) + 2dbZ dcz{AbAc) irrespective or the method used to estimate a, b, and 

c. Actual numbers are found by inserting the estimates 
with daZ — dz/da, and similarly for b and c; z = (d, &, c). The results for the aliased Lorentzian are given 
/c, /o, D, ■ ■ ■, and (AaA6) etc. are elements of the covari- in Appendix E. 



13 



1.2r 



ra 0.8 



5 0.6 



Simulated fc 
Simulated D 
Analytical fc 
Analytical D 
[(n+3)/(n+1)]^' 



O 0.4 



0.2 L 



16 32 64 128 256 
n 



FIG. 7: Stochastic error on fit parameters as a function of 
number, n, of PSDs averaged over, for ML-fits of aliased 
Lorentzians using simplifying trick. Compared to the de- 
pendence on the measurement time (Fig. 8), the error is 
only weakly dependent on n. The sampling frequency was 
held fixed at /sample ~ 16, 384 Hz, and the fit was done to 
the averaged power spectra generated with A'^ — 262, 144 
(W = 16s), fc = 500Hz, and D = 0.46Atm/s^ Filled 
symbols show the coefficient of variation, a{X)/ (X) , with 
X = fc,D determined using Eqs. (C2) and (C3), from 100 
independent stochastic simulations. Empty symbols show 
the theoretical expectation values from Eqs. (El) and (E2). 
Dashed lines show the ^(n + 3)/(n + 1) € [1 : V2] scaling 
predicted from Eqs. (77) and (E5). 



2. Results for AFM cantilevers 

For the AFM, the variance of the three fit parameters 
are: 
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FIG. 8: l/v'iV-dependence of stochastic error on fit parame- 
ters for ML-fits of aliased Lorentzians using simplifying trick. 
An expected 1 / \/A behavior is easily made out on the double- 
logarithmic scale. The sampling frequency was held fixed 
at /sample = 16, 384 Hz, and the fit was done to the aver- 
age of n = 16 power spectra generated with fc = 500 Hz 
and D — 0.46 /im/s'^. Filled symbols show the coefficient 
of variation, a{X)/ {X) , with X — fc,D determined using 
Eqs. (C2) and (C3), from 100 independent stochastic simula- 
tions. Empty symbols connected by lines show the theoretical 
expectation values from Eqs. (El) and (E2). 



the support, or backing, is 

B{so) = erfc (|so - K\^/n/(2Kfj (84) 
where erfc is the complementary error function, sq = 

j(ex) 



- 2- 



ac 



(80) 



where 



47r4 



[((Aa)2)/-^ + ((A6)2) + ((Ac)^)/o^ 



+ 2 {AaAb)fo^ + 2(AaAc) + 2{AbAc)f^] 

= ^ [{{Aaf)fo-' + ((A6)2) + {{AcfKG- 

+ 2 (AaA6)/o-2 + 2(AaAc)(l - G-Vq"^) 

+ 2(A6Ac)(G-i-/2)] , 



J2f Pf /-P/) ^iid K is the number of terms in the sum 

So- We have assumed above that this number is much 
larger than the number of parameters fitted, hence equal 
to the number of degrees of freedom. It is of order 10^- 
10^ in our case, while 2-3 parameters arc fitted. When 
''the sum sq = K, the expectation value for so, the back- 
ing is one but it rapidly drops to zero as sq becomes 
larger or smaller than K . The backing calculated for 
the fits shown in Fig. 5 were, respectively, (Isq fits; zero 
(^81)within the numerical precision of MatLab), 0.87 (ML- 
fit with simplifying trick), and 1.00 (rigorous numerical 
ML-fit; the first deviation from unity is in the 7th dec- 
/q)^ imal place). These numbers are stochastically varying 
because the PSD values are. The backing for the rigor- 
ous numerical ML-fit will always be close to one because 
(82)the stationarity conditions, see Eq. (53), for which the 
fit parameters are determined are virtually the same as 
so = K. 



G = . 



(83) 



VII. SUMMARY AND CONCLUSION 



3. How good are the fits? 

The goodness of fit, i.e., the support for the hypothesis 
that the fitted theory is correct, is [16] the probability 

that a repetition of the experiment yields a data set with 
a smaller value for p. A calculation shows that for K ^ 1 



1 . A time series of N coordinate values for an optically 

trapped microsphere or an AFM cantilever doing 
Brownian motion, gives rise to N power spectral 
values {N/2+1 distinct values) with signal-to-noise 
ratio 1 (Section IV). Parameters characterizing the 
power spectrum can consequently be determined 
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with stochastic errors of order 1/\/N in the ahased 
case (Section VIC and Appendix E). For the non- 
aliased case see also Fig. 2. 

2. For the purpose of displaying the experimental 

power spectrum, its signal-to-noise value is re- 
duced by a factor y/n by dividing the original data, 
the time series of N coordinates, into n equally 
long, non-overlapping subseries (Section IV). From 
these, n experimental power spectra are calculated 
and averaged over. This noise-reduced power spec- 
trum covers the same frequency interval, but the 
separation A/ between consecutive points has in- 
creased by a factor n. 

3. Noise reduction trades resolution on the frequency 
axis for resolution on the power axis in a man- 
ner that loses no information about the parameters 
characterizing the power spectrum. 

4. The \l\pn scatter of experimental power spectral 
values should not be confused with the expected 
stochastic error on parameter values characteriz- 
ing the power spectrum; the latter are of order 

with only a weak n-dependence, as shown 
in Eq. (75), and Figs. 8 and 7. 

5. Formulas given above, based on maximum likeli- 
hood estimates, eliminate this issue and all further 
fitting for the cases of the non-aliased and aliased 
Lorentzian power spectrum and for the non-aliased 
damped harmonic oscillator as model for an AFM 
cantilever (Section VIA). Fitting has been done 
once-and-for-all and the results, including crrorbars 
and a goodness-of-fit measure, are given by the for- 
mulas in Section VI and Appendix C. 

6. For other systems described by a linear Langevin 
equation driven by white or colored noise (Sec- 
tion IV A), one can determine parameters of the 
theoretical power spectrum by fitting it to an ex- 
perimental spectrum, using weighted least-squares 
fitting and experimental or theoretical error bars as 
weights, which is computationally faster and more 
robust than MLE. The resulting value for the diffu- 
sion coefficient D should be corrected as described 
in Eqs. (32) and (41) respectively. 

7. Quite generally, beyond optical traps and AFM 
cantilevers, if the dependent variable (the data) and 
the squared weights are independent, then weighted 
least-squares yields unbiased results. If not, the 
least-squares fit is typically biased, but this bias 
can often be removed altogether by a simple re- 
scaling of the fit-results (Section V). 

8. To minimize the stochastic error on fit parameters 
we suggest setting n = 8 or larger if using ML-fits 
with our simplifying trick or WLS with theoreti- 
cal weights, see Eqs. (77) and (E5) and Fig. 7. If 
WLS with experimental weights is used, we suggest 



n = 16 or larger, see Fig. 3. Also, for given /c, n, 
and TV, /sample = 8/c will minimize the stochas- 
tic error, see Fig. 9. Typically however, tmsr a-nd /<. 
are fixed and /sample should simply be set as high as 
meaningfully possible as N is the most important 
factor in increasing the precision, see Fig. 8. 

9. For optical tweezers, the highest precision is ob- 
tained by fitting over as much of the frequency 
range as can be captured by theory, including mod- 
ification to the PSD due to hydrodynamics, optics, 
and instrumentation [9, 10, 13]. If a certain level 
of precision is required it then becomes a matter 
of measuring long enough with the sampling fre- 
quency as high as experimentally possible (the min- 
imum in Fig. 9 is for fixed N). 

10. When doing least squares fitting we recommend the 

use of theoretical weights, Eq. (36), if the standard 
deviation is known to be proportional to the ex- 
pected value, because this minimizes the stochastic 
errors on the fit parameters, see Figs. 2 and 3. If 
any bias is present, that will also be smaller than 
the bias resulting from experimental weights, and 
can often be eliminated with the help of Eq. (40) 
or (41). 




1024 2048 4096 8192 16384 32768 65536 131072 262144 



sample 

FIG. 9: Stochastic error on fit parameters as a function of 
sampling frequency, /sample, for ML-fits of aliased Lorentzians 
using simplifying trick. A non-monotonic behavior is seen 
with minimum around /sample = 8/c. The number of acquired 
data points was licld fixed at = 262, 144, and the fit was 
done to the average of n = 16 power spectra generated with 
/c = 500 Hz and -D = 0.46 /im/s^. Filled symbols show the co- 
efficient of variation, a{X)/{X), with X = fc,D determined 
using Eqs. (C2) and (C3), from 100 independent stochastic 
simulations. Empty symbols connected by lines show the the- 
oretical expectation values from Eqs. (El) and (E2). 
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Appendix A: Monte Carlo simulation of the 
Einstein-Ornstein-Uhlenbeck theory of Brownian 
motion in a harmonic potential 

In the case of non-negligible mass m, Eq. (1) is rewrit- 
ten as two coupled first-order equations, 



i/2:l ( 



where we have introduced the 2x2 matrix 



M 



-1 

m m 



(Al) 
(A2) 



Equation (Al) has the solution 



v{t) 



from which follows that 



Xj + i 



MAt I \ 



Axj 



Here 
where 



-A_c+ + A+c_ —c-^+c- 
A+ - A_ V - C-) A+C+ - A_c_ 



(A3) 
(A4) 

(Vs) 



A±-^±\/A-- ."±-f+? ) (A6) 
2m V 4m2 m V "^^^ J 

arc the two eigenvalues and corresponding eigenvectors 
of M, 



c± = cxp(— A±At) 
and we have introduced the notation 



Av. 



with 



(A7) 
(A8) 



Ax±j = (2D)V2 ^+ + ^- f df'e-^±(*^+i-*')ry(t') . 
''^-1- - ^- Jtj 

(A9) 

Prom Eq. (2) follows that Ax±^j are two random lengths 
drawn from Gaussian distributions with vanishing expec- 
tation value and known variances: 

{Ax±^jAx±^k) = cf'i 6j^k , (AlO) 

It also follows that Ax+j and Ax-j are correlated with 
each other, but uncorrelated with all Ax±^k for j ^ k: 



(All) 



a±_ =2D 



A+ + A_\M-c 



A+-A_y A+-hA_ 



From their known correlation follows, after some calcu- 
lation, that they can be expressed in terms of two uncor- 
related Gaussian distributed random numbers with unit 
variance, r/j ' and r/^ , as 

Axi 



3 _ 



(A12) 



A, 



-1 



-A_ 
1 



1/2 Jb) 
j 



where we have introduced the notation 



A+ + A_ {l-cl)D 



A+- A_ 



2A± 



and 



Ta^a: 



'^+ + ^-^(i-c^)(i-c2) ■ 



(A13) 



(A14) 



So iteration of Eq. (A4) with use of Eq. (A12) generates 
a time scries of positions Xj, which is sampled cquidis- 
tantly in time with separation At from the continuous- 
time solution to Eq. (1). Since we use the exact ana- 
lytical solution of Eq. (1) in the generation of this se- 
ries, the finite value of At causes no discretization error. 
The only numerical errors associated with our solution 
are associated with the representation of real numbers 
on a computer, and, rather hypothetical, with the use of 
pseudo-random numbers. 



Appendix B: Alisised AFM Power Spectrum 

For the AFM, using the results from Appendix A, we 
get for the aliased power spectrum: 

-,(cx) 



At (Bl) 



where 



a+ = l + cl-2c+cos{2TTk/N) (B2) 
a_ = l-l-c^ -2c_cos(27rfc/Af) (B3) 
a+_ = 1-1- c+c_ - (c+ + C-)cos{2nk/N) (B4) 



and we have used that the discrete Fourier transform of 
77^"^ and rjj'^ have the following characteristics 

(r)i">r)f)) = (B5) 

(^i"^*^l"^> = (^r*^f^> = UsrAtSk, . (B6) 
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For completeness we note that the expression in Eq. (Bl) 
has as hmiting expression Eq. (CI) when the mass van- 
ishes: 



hm {Pr'} 
m->-0 " a 



(B7) 



as is seen by inspection. 



Appendix C: Maximum Liicelihood Estimation for 
aliased power spectra 



Wc do not attempt here to give the ahased results for 
fo,D, and G from the AFM case: To avoid the alias- 
ing of high frequency noise to the lower frequencies of 
interest, a high sampling frequency is often used when 
acquiring AFM data. However, only the region around 
/o is well captured by the dampened harmonic oscillator 
theory and therefore no more than this region is fitted. 
Since the aliased expressions only deviate substantially 
from the non-aliased ones at high frequencies, and be- 
cause the non-aliased expressions are much simpler, we 
only treated the non-aliased MLE for the AFM here. 



In an OT experiment, the time-series of bead positions 
x{t) is obtained by sampling the continuous output from 
the photodiode at discrete times tj = jAt, At = 1/fmsr- 
Applying the discrete Fourier transform to x{t) we find [9] 
that the expectation value for the aliased power spectrum 
can be written in the form: 



j^alias 



1 



A + B cos{2Trk/N) ' 
where A and B are related to /c and D through 

/sample 



fc 

D 



2tt 

/sample 

- ^ 

A tanh('u) ' 
u = cosh~^{-A/B) . 



(CI) 

(C2) 

(C3) 
(C4) 



By inserting Eq. (CI) in Eq. (53) the stationarity con- 
ditions {OaJ^ = ObJ^ = 0) are seen to be 



(C5) 



Y^cos{2-Kk/N)pf'''^ = ^cos(27rfc/7V)P/ . (C6) 



/ 



We now repeat the trick introduced in Section VIA to 
turn Eqs. (C5) and (C6) into expressions linear in A and 
B 



Ra,2 Ri,2 

Rl,2 -^2,2 
R 



(i+i/n)(2;;; 



(C7) 



that are solved to give 



A 
B 



A-: 

B 



n i?0, 2^^2,2 — R\,2 

n+1 R0.2R1S ^ Ro,iRi,2 



n -Ro,2^2,2 — R1.2 
where we have introduced the (aliased) statistics 



(C8) 
(C9) 

(CIO) 



Appendix D: Covariance Matrix 

This appendix derives the results that are used in Sec- 
tion VI C and Appendix E. To calculate the covariance 
matrix we look at the response of the estimated fit pa- 
rameters to fluctuations in the statistics Sp^q. The calcu- 
lations go through unchanged for the aliased Lorentzian 
(see above) with statistics Rp^q- First, we note that we 
can write each term in Eq. (65) 



SvkSv = S\ b \={l + l/n)s (Dl) 

as the sum of its "true underlying" value and a fluctua- 
tion (AS and As) or a response {Av) to fluctuations 



S = (S)-FAS 

V = V + Av 
s = {s}+As 



where the elements in AS are 
1 



(D2) 
(D3) 
(D4) 



(D5) 



To first order in the fiuctuations we thus have 

'n+1 



As- ASi; 



(D6) 



where we notice that to first order (Av) = which, how- 
ever, does not mean that this is an unbiased estimator 
as shown below. The covariance matrix {Av (g) Av) as 
given in Eq. (74) then follows after some calculation us- 
ing Eqs. (49) and (15). 

Wc emphasize here, that the above calculations were 
to first order in with K the number of terms in 

the statistics Sp^qi Whereas the relative size of an indi- 
vidual fluctuation in the power spectral value AP^™^ is 
independent of K, the relative sizes of the overall fluctu- 
ations in the sums ASp^q go to zero as 1/VK. Since K 
is typically of the order 10''-10^, this first-order approx- 
imation is very good. 
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Appendix E: Error-bars for the aliased Lorentzian 

The error-bars on /c and D are calculated as before, 
using Eq. (73) for the variance: 



/sample 

47r2(A2 
AB 



((AA)2) ^ ((AS)2) 



(El) 



and 



cj\D) = {dADf{{AAf) + {dBDf{{ABf) 



+ 2dADdBD{AAAB) 



where 



B^ 



D 



A KA^-B"^ 



D 



A 



uVA^ - 52 

^2 



B VWA2-S2 A2_52 



(E2) 

(E3) 
(E4) 



and u is given in Eq. (C4). The covariance matrix is 
calculated as before, giving 



cov(A, B) 



( {{AAf) {AAAB) 
\ {AAAB) {{ABf) 



1 n- 



Nn+l 



(E5) 



where R is a matrix with the same structure as R, sec 
Eq. (C7), but with the experimental values P^®'^) replaced 
by the theoretical (in practice the fitted) values Pf in all 
the statistics Eq. (CIO). 

We tested the above analytical results by comparing to 
the results of simulations: Multiple independent position 
time-series for a mass-less particle diffusing in a harmonic 
potential were created using the methods given in [9]. 
The resulting power spectra were fittcxl using Eqs. (C2- 
ClO). Since we simulate a stochastic process there is 
scatter in the fitted parameters and it is this scatter that 
we compare to the results given in Eqs. (El) and (E2). 
The results are shown in Figs. 8 and 7. Figure 8 shows 
the expected 1/VN scaling, whereas Fig. 7 shows a 
^t/n + ij^Jn + 1 scaling. 

Figure 9 shows that the optimal tradeoff between preci- 
sion and amount of data acquired seems to manifest itself 
at a sampling frequency roughly eight times the corner 
frequency; any slower than this leads to large errors in 
both parameters because the PSD essentially reduces to 
the ratio of D to /c, i.e., two parameters are used to fit 
a single constant. Sampling much faster than /c, but 
keeping the number of acquired data points fixed, has no 
effect on the error on D but is detrimental for /c since 
there is progressively less information about /c at larger 
frequencies. The effect on the precision of increasing n 
is small but positive; compared to the effect of N and 



/sample it can be ignored (after including it as described 
in Eqs. (C8) and (C9)). 

For comparison, the error-bars on the fit parameters, 
as a function of cut-off frequency /max, from a least- 
squares fit are shown in Fig. 2: The average of n = 16 
synthetic PSDs were fitted by minimizing Eq. (23) with 
the data-points weighted by the standard deviation of 
the n PSDs. Also shown is a LSQ fit where the weights 
are kept constant; this is the kind of fitting performed 
by primitive LSQ routines. For a detailed discussion 
of how the stochastic error depends on the fitting range 
[/min : /max] the reader is referred to section VIII in [9]. 



Appendix F: Bias of the MLEs 

Obviously, we must be paying a price somewhere for 
turning a non-linear problem into a linear one with our 
little trick, or else we would have turned a non-linear 
problem into an exactly solvable mathematical problem. 
That is sometimes done, but not here: The trick works 
through an approximation, and the resulting approxi- 
mate estimator is biased, which means that its expecta- 
tion value is different from the true value of the quantity 
it estimates. So on the average it misses the correct re- 
sult. Bias is systematic error on averages. That is the 
nature of the price we pay. Fortunately, it is negligible 
in size, as we demonstrate now. 

For the non-aliased Lorentzian and AFM we find, by 
expanding Eq. (Dl) to first order in Av and second order 

in ASri.a 




(Fl) 



(AS(S)-^AS)z 



n 



-(AS(S)-iAs) 



For the non-aliased Lorentzian this expression can be re- 
duced to 



82,2 —Si^2 \ ( ^2,3 —2Si^3 ^o;. 



Sl,2 So^2 ) V "SSiS —252,3 S\;, 




where 



/ / 

That is, the bias is proportional to 1/A'' which is a very 
small number, and displays a weak n-dcpendence. From 
these expressions we find the bias on /c and D to be 

(A6) 



{AD) = -D 



(F5) 
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We only know the true values of fc and D in simula- 
tions, in an experiment the best estimate of the true value 
would be the fitted value. 

For the aliased Lorentzian we find in an analogous 
manner 



(A/c) 
(AD) 



/sample 



A 



(AS) _ (AA) 
B A 



2tt y/A^ - 

B^ A 

f{AA) ^ {AB)\ {AA) 



(F6) 



-D 



A 



A 



(F7) 



where (AA) and (AB) are found from Eq. (Fl) by ev- 
erywhere replacing S with R. The result of a numerical 
test of the above relations is shown in Fig. 10. 
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FIG. 10: Bias of the parameter estimates for fc (grey squares) 
and D (white circles), from ML-fits with our trick, of the 
aliased Lorentzian. Error-bars are standard errors on the 
mean. The point of this figure is only to show that the bias is 
indeed very small and can be completely ignored. Theoretical 
expectation value for the bias as given in Eq. (F6) (grey line) 
and Eq. (F7) (black line) arc less than 1/1, 000 of a percent for 
these settings. Bias was measured as the difference between 
the average of 1,000 determinations of the fit parameters and 
the known input values fc = 500 Hz and D = 0.46 /im/s^. 
Simulations were run with N = 2^° and /sample = 2^^ Hz — 
values chosen to minimize the stochastic errors. Data were 
treated with n non-overlapping Hann windows before calcu- 
lation of the PSD. 
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